Abstract

We study algorithms for spectral graph sparsification. The input is a graph $G$ with $n$ vertices and $m$ edges, and the output is a sparse graph $\tilde{G}$ that approximates $G$ in an algebraic sense. Concretely, for all vectors $x$ and any $\epsilon > 0$, $\tilde{G}$ satisfies

$$(1 - \epsilon)\, x^T L_G x \le x^T L_{\tilde{G}} x \le (1 + \epsilon)\, x^T L_G x,$$

where $L_G$ and $L_{\tilde{G}}$ are the Laplacians of $G$ and $\tilde{G}$ respectively.

We show that the fastest known algorithm for computing a sparsifier with $O(n \log n/\epsilon^2)$ edges can actually run in $\tilde{O}(m \log^2 n)$ time¹, an $O(\log n)$ factor faster than before. We also present faster sparsification algorithms for slightly dense graphs. Specifically, we give an algorithm that runs in $\tilde{O}(m \log n)$ time and generates a sparsifier with $\tilde{O}(n \log^3 n/\epsilon^2)$ edges. We also give an $\tilde{O}(m)$ time algorithm for graphs with more than $n \log^5 n (\log\log n)^3$ edges of polynomially bounded weights, and an $O(m)$ algorithm for unweighted graphs with more than $n \log^8 n (\log\log n)^3$ edges. The improved sparsification algorithms are employed to accelerate linear system solvers and algorithms for computing fundamental eigenvectors of slightly dense SDD matrices.

1 Introduction 
The efficient transformation of dense instances of graph problems to nearly equivalent sparse instances is a very powerful tool in algorithm design. The idea, widely known as graph sparsification, was originally introduced by Benczúr and Karger [3] in the context of cut problems. Spielman and Teng [13] generalized the cut-preserving sparsifiers of Benczúr and Karger to the more powerful spectral sparsifiers, which preserve in an algebraic sense the Laplacian matrix of the dense graph. The main motivation of spectral sparsifiers was the design of nearly-linear time algorithms for the solution of symmetric diagonally dominant (SDD) linear systems. A matrix $A$ is SDD if it is symmetric and for all $i$, $A_{ii} \ge \sum_{j \ne i} |A_{ij}|$.

Benczúr and Karger proved that, for arbitrary $\epsilon$, cuts can be preserved within a factor of $1 \pm \epsilon$ by a graph with $O(n \log n/\epsilon^2)$ edges. This graph can be computed by a randomized algorithm that runs in $O(m \log^3 n)$ time², where $m$ is the number of edges in the dense graph. Spielman and Teng gave the first construction of spectral sparsifiers, but the edge count of these objects was several log factors bigger than that of Benczúr and Karger's cut-preserving sparsifiers. However, recent progress that we review below now allows for the construction of spectral sparsifiers with $O(n \log n/\epsilon^2)$ edges in $\tilde{O}(m \log^3 n)$ time.

¹ We use the $\tilde{O}()$ notation to hide one $\log\log n$ factor, which in most cases is due to the guarantees of the best currently known algorithm for computing low-stretch trees [1]. In the main part of the paper we provide a more accurate accounting of the running time.

² All sparsification algorithms in this paper are randomized, with a failure probability inversely proportional to $n$. They consist of a preprocessing phase followed by the generation of the sparsifier, which in general can be performed in time proportional to the number of edges in it (e.g. $O(n \log n/\epsilon^2)$). For the sake of conciseness our running time statements will include only the time for preprocessing and will omit the failure probability.

Sparsification can be employed to immediately accelerate algorithms for numerous problems. In several 
cases and depending on the density of the instance, the sparsification routine dominates the running time of 
the sparsifier-enhanced algorithm. This is a strong incentive for speeding up the construction of sparsifiers 
even further. 

This problem was undertaken in the context of cut-preserving sparsifiers by Fung et al. [5]. Improving upon the work of Benczúr and Karger, they proved that there is an $O(m \log^2 n)$ time algorithm that computes a sparsifier with $O(n \log n/\epsilon^2)$ edges. This stands as the fastest known algorithm with this sparsity guarantee for general graphs. However, Fung et al. also showed that we can do even better on slightly more dense graphs. More concretely, they proved that there is an $O(m)$ time algorithm that computes a sparsifier with $O(n \log^2 n/\epsilon^2)$ edges. Note that by transitivity, a combination of the two algorithms can produce a graph with $O(n \log n/\epsilon^2)$ edges in $O(m + n \log^4 n)$ time. In other words, there is a linear time sparsification algorithm for graphs with more than $n \log^4 n$ edges.

This leads us to the main question we address in this paper: Is something analogous possible for spectral sparsification? We answer the question in the affirmative. We first show that a slight modification of the Spielman-Srivastava algorithm [12] can improve the run time to $\tilde{O}(m \log^2 n)$. This nearly matches the general case algorithm of [5]. We present three additional sparsification algorithms. The first is a variation of the Spielman-Srivastava algorithm that generates a sparsifier with $\tilde{O}(n \log^3 n/\epsilon^2)$ edges in $\tilde{O}(m \log n)$ time. The second produces a sparsifier with $O(n \log n/\epsilon^2)$ edges in $\tilde{O}(m)$ time, assuming the input has more than $n \log^5 n (\log\log n)^3$ edges whose weights are bounded by a polynomial in $n$. The third produces a sparsifier with $O(n \log n/\epsilon^2)$ edges in $O(m)$ time assuming the input is unweighted and has more than $n \log^8 n (\log\log n)^3$ edges.

Applications in numerical algorithms 

The $(1 \pm \epsilon)$-sparsifiers we obtain can be employed in a standard way as preconditioners for SDD linear systems, giving us faster solvers for slightly dense graphs: (i) an $\tilde{O}(m)$ time solver for systems with more than $n \log^5 n (\log\log n)^3$ non-zero entries and (ii) an $O(m)$ time solver for Laplacians of unweighted graphs with more than $n \log^8 n (\log\log n)^3$ non-zero entries. The best previously known algorithm [11] runs in $\tilde{O}(m \log n \log(1/\delta))$ time.

In addition, our sparsification algorithms accelerate the computation of an approximate Fiedler eigenvector of a graph Laplacian $L_G$. A $(1 + \epsilon)$-approximate eigenvector is a unit norm vector $x$ such that $x^T L_G x$ is within a factor $1 + \epsilon$ from the eigenvalue $\lambda_2$ of $L_G$. The algorithm consists of two steps: (i) computing a spectral sparsifier $\tilde{G}$ that $(1 \pm \epsilon/2)$-approximates the input graph $G$; (ii) computing a $(1 + \epsilon/3)$-approximate eigenvector of $\tilde{G}$. This will automatically be a $(1 + \epsilon)$-approximate eigenvector of the (more) dense input graph because the spectral sparsification step preserves the eigenvalues of $G$ within $1 \pm \epsilon/2$. Hence combining our sparsification algorithms with the inverse power method [14] (which consists of solving $O(\log n \log(1/\epsilon))$ systems in $L_{\tilde{G}}$) gives an approximate eigenvector in $\tilde{O}(m + n \log^5 n \log(1/\epsilon)/\epsilon^2)$ time. The fastest previously known algorithm runs in time $\tilde{O}(m \log^2 n \log(1/\epsilon))$. The same result applies to the computation of the Fiedler eigenvector of a normalized Laplacian $D^{-1/2} L_G D^{-1/2}$; applying the inverse power method on $D^{-1/2} L_{\tilde{G}} D^{-1/2}$ gives the required eigenvector.

We note here that one practical application of eigenvectors is in partitioning algorithms; the analysis of 
Cheeger's inequality [4] tells us how to turn an approximate Fiedler vector into a partition. Hence, we give 
an improvement to the running time of a fundamental graph partitioning algorithm. Finally we note that 
the computation of additional eigenvectors can be performed in the same amount of time (per vector) by 
restricting the action of the matrix to the complement of the subspace spanned by the previously computed 
eigenvectors. 
2 Overview of our techniques

2.1 Brief background on spectral sparsification 

The first algorithm for edge-efficient spectral sparsifiers was given by Spielman and Srivastava [12]. Their algorithm produces a sparsifier with $O(n \log n/\epsilon^2)$ edges in a very elegant way: it samples edges with replacement. The probability of sampling an edge is proportional to its weight multiplied by its effective resistance in the resistive electrical network associated with the given graph.

Computing the effective resistance of a given edge requires, almost by definition, the solution of a linear system on the graph Laplacian.³ However, Spielman and Srivastava also provided a way of estimating all $m$ effective resistances, via solving $O(\log n)$ SDD linear systems. This holds under the assumption that the SDD solver is direct, i.e. it outputs an exact solution. The use of a nearly-linear time iterative solver that computes approximate solutions introduces an additional source of imprecision; Spielman and Srivastava showed that solving the systems up to an inverse polynomial precision is sufficient for sparsification. This brings the running time of their algorithm to $O(m \log^{c+2} n)$, where $c$ is the constant appearing in the running time of the SDD solver.

2.2 The $\tilde{O}(m \log^2 n)$ time algorithm

While the work of Spielman and Srivastava did not improve the running time of the SDD solver, it proved to be a decisive step towards the fast SDD solver of Koutis, Miller, and Peng [10, 11], which runs in time $\tilde{O}(m \log n \log(1/\delta))$, where $\delta$ is the desired precision. Using this solver in the Spielman and Srivastava sparsification sampling scheme immediately yields an $\tilde{O}(m \log^3 n/\epsilon^2)$ time algorithm. This brings us to the first contribution of this paper, a tighter analysis of the Spielman and Srivastava algorithm. In Section 4 we show that solving the systems up to fixed precision is actually sufficient for sparsification. This decreases the running time to $\tilde{O}(m \log^2 n/\epsilon^2)$.

2.3 Faster algorithms: The main idea 

To get our two faster algorithms, we will trade accuracy in the computation of effective resistances for speed. The idea is to transform the input graph $G$ into another graph $H$ where effective resistances can be computed faster while still providing good bounds for the true effective resistances in $G$. These approximate effective resistances can still be used for sparsification at the expense of additional sampling [10] that yields slightly more dense sparsifiers. These sparsifiers can be re-sparsified to $O(n \log n/\epsilon^2)$ edges by applying the fast general-case algorithm.

2.4 The $\tilde{O}(m \log n)$ time algorithm

The $\tilde{O}(m \log n)$ time algorithm is based on the observation that the Spielman-Srivastava scheme can be implemented to run in $O(m \log n)$ time on a spine-heavy approximation $H$ of $G$. The spine-heavy graph $H$ is derived in $\tilde{O}(m \log n)$ time from $G$ by computing a low-stretch tree of $G$ and scaling it up by an $\tilde{O}(\log^2 n)$ factor. In [11] it was shown that linear systems involving the Laplacian of $H$ can be solved in $O(m)$ time, enabling the faster implementation of the Spielman-Srivastava scheme on $H$. At the same time the effective resistances in $H$ are at most an $\tilde{O}(\log^2 n)$ factor smaller than those in $G$. Sampling with respect to these estimates allows us to get a sparsifier $G'$ with $\tilde{O}(n \log^3 n/\epsilon^2)$ edges. Re-sparsifying $G'$ gives a sparsifier $\tilde{G}$ with $O(n \log n/\epsilon^2)$ edges in $\tilde{O}(n \log^5 n)$ time. The details are given in Section 5.

2.5 The $\tilde{O}(m)$ and $O(m)$ time algorithms

The starting point for our fastest algorithm is an $O(m)$ time algorithm for computing an approximate sparsifier, i.e. a sparse approximation for the input graph, but of moderate quality. We will actually do this by sparsifying a graph $H$ that is a $\kappa$-approximation of $G$. Then we observe that with some additional work we can 'leverage' the approximate sparsifier to compute a sparsifier for $G$ as well. Indeed, let $\tilde{H}$ be a sparsifier of $H$ with $O(n \log n)$ edges. Then we generate a low-stretch spanning tree $T$ of $\tilde{H}$ in $\tilde{O}(n \log^2 n)$ time and approximate the effective resistances of $G$ over $T$ in $O(m)$ time. We will be able to claim that these approximate values are enough to generate a sparsifier $G'$ for $G$ with $\tilde{O}(n \kappa \log^3 n)$ edges. Finally from $G'$ we can compute a sparsifier $\tilde{G}$ with $O(n \log n/\epsilon^2)$ edges in $O(n \kappa \log^5 n (\log\log n)^2)$ time using our first algorithm. The fact that we can tightly estimate the resistances in $G$ using a tree $T$ is a departure from the Spielman-Srivastava scheme and may be of independent interest.

³ Laplacian matrices are SDD.

We will derive our $O(m)$ algorithm via a single application of the above 'leveraging' idea for $\kappa = \tilde{O}(\log^3 n)$. To improve performance for sparser graphs we will progressively sparsify a sequence of $t = O(\log\log n)$ graphs $H = H_0, H_1, \ldots, H_t = G$, such that $H_i$ is a 2-approximation of $H_{i+1}$; given the sparsifier for $H_i$ we can construct the sparsifier for $H_{i+1}$ within the claimed time. The details are given in Section 6.

3 Background on spectral graph theory and sparsification 

3.1 The graph Laplacian and its pseudoinverse 

Let $G = (V, E, w)$ be an undirected weighted graph on $n$ vertices, which we identify with the integers $\{1, 2, \ldots, n\}$, and $m$ edges, where the weight of edge $e$ is given by $w_e$. Without loss of generality we will assume that the minimum weight is 1. We will also assume that matrices are represented as adjacency lists.

The Laplacian of $G$ is denoted by $L_G$. It is a symmetric $n \times n$ matrix with zero row and column sums, where the off-diagonal entry $(i,j)$ is given by $-w_{ij}$ if $(i,j)$ is an edge of $G$ and 0 otherwise. The $i$th diagonal entry is given by the weighted degree of vertex $i$.

If $G$ is a connected graph, then $L_G$ is a matrix of rank $n - 1$, with its kernel spanned by $\mathbf{1}$ (the vector of all 1's). We let $L_G^+$ denote the Moore-Penrose pseudoinverse of $L_G$; this is a matrix that acts as the inverse of $L_G$ on $(\ker L_G)^\perp$, and satisfies $L_G^+ L_G = L_G L_G^+ = I_{n-1}$, where $I_{n-1}$ is the projection onto the $(n-1)$-dimensional image of $L_G$.

Given the one-to-one correspondence of graphs and their Laplacians we will often apply algebraic 
notation to graphs, with the obvious meaning. 

3.2 Spectral approximation and sparsification 

In this paper we concentrate on symmetric diagonally dominant matrices. For two matrices $A$ and $B$ of the same dimension, we write $A \preceq B$ if $x^T A x \le x^T B x$ for all vectors $x$. For two graphs $G$ and $H$, we write $G \preceq H$ if the Laplacians satisfy $L_G \preceq L_H$.

Definition 3.1 We say that a graph $H$ is a $\kappa$-approximation of a graph $G$ if $G \preceq H \preceq \kappa G$.

It is not hard to show that if $H$ is a graph that $\kappa$-approximates a graph $G$ then we have

$$\frac{1}{\kappa} L_G^+ \preceq L_H^+ \preceq L_G^+. \qquad (3.1)$$

Definition 3.2 Given a graph $G$, we say that a (sparser) graph $H$ is a $1 \pm \epsilon$ spectral sparsifier of $G$ if

$$(1 - \epsilon) G \preceq H \preceq (1 + \epsilon) G. \qquad (3.2)$$

It is easy to see that if $H$ is a $1 \pm \epsilon$ spectral sparsifier of $G$ then $\frac{1}{1-\epsilon} H$ is a graph that $\frac{1+\epsilon}{1-\epsilon}$-approximates $G$. By the definition, it is also easy to verify transitivity: if $G_1$ is a $1 \pm \epsilon_1$ sparsifier of $G$ and $G_2$ is a $1 \pm \epsilon_2$ sparsifier of $G_1$, then $G_2$ is a $(1 \pm \epsilon_1)(1 \pm \epsilon_2)$ sparsifier of $G$.
3.3 Graphs as resistive electrical networks

We can consider our graph $G$ as an electrical network of nodes (vertices) and wires (edges), where edge $e$ has a resistance of $w_e^{-1}$ Ohms.

In this context it is very useful to give another definition of the Laplacian $L_G$, in terms of its incidence matrix $B_G$. To define $B_G$, fix an arbitrary orientation for each edge in $G$. For a vertex $i$ let $\chi_i$ be its $(n \times 1)$ characteristic vector, with a 1 at the $i$th entry and 0's everywhere else. Let $e = (i, j)$ be an edge and define $b_e = \chi_i - \chi_j$. Then $B_G$ is the $m \times n$ matrix whose $e$th row is the vector $b_e$. Let $W_G$ be the $m \times m$ diagonal matrix whose $e$th diagonal entry is $w_e$. With these definitions, it is easy to verify that

$$L_G = B_G^T W_G B_G = \sum_e w_e b_e b_e^T.$$

For notational convenience, we will drop the subscripts on $L_G$, $B_G$, and $W_G$ when the graph we are dealing with is clear from context.

Going back to the electrical analogy, the effective resistance between vertices $i$ and $j$, denoted by $R_G(i,j)$ or $R_G(e)$ when $(i,j)$ is an edge $e$, is the voltage difference that has to be applied between $i$ and $j$ in order to drive one unit of external current between the two vertices. Algebraically it is given by

$$R_G(i,j) = (\chi_i - \chi_j)^T L_G^+ (\chi_i - \chi_j). \qquad (3.3)$$

The above equation allows us to apply (3.1) and see that

$$G \preceq H \preceq \kappa G \;\Rightarrow\; (1/\kappa) R_G(e) \le R_H(e) \le R_G(e). \qquad (3.4)$$

The definition of the effective resistance for $(i,j)$ in (3.3) shows directly that it can be computed by solving the system $L_G x = (\chi_i - \chi_j)$. In light of this, (3.4) will be of central importance in our proofs. Informally, it states that if $H$ is a $\kappa$-approximation of $G$, then the effective resistance of any edge in $G$ can be approximated by the effective resistance of the same edge in $H$, which can be done by solving the system $L_H x = (\chi_i - \chi_j)$. This will allow us to construct special approximations $H$ for which solving with $L_H$ is easier than with $L_G$.
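As a concrete illustration of (3.3), the following sketch builds $L = B^T W B$ for a small graph and evaluates an effective resistance through a dense pseudoinverse. It is only a numerical check and not one of the fast algorithms discussed in this paper; the helper names `laplacian` and `effective_resistance`, the 0-based vertex labels, and the use of NumPy's `pinv` are assumptions made for the example.

```python
import numpy as np

def laplacian(n, edges):
    """Return L = B^T W B for edges given as (i, j, weight) triples."""
    B = np.zeros((len(edges), n))
    w = np.zeros(len(edges))
    for row, (i, j, weight) in enumerate(edges):
        B[row, i], B[row, j] = 1.0, -1.0  # row e of B is b_e = chi_i - chi_j
        w[row] = weight
    return B.T @ (w[:, None] * B)

def effective_resistance(L, i, j):
    """R(i, j) = (chi_i - chi_j)^T L^+ (chi_i - chi_j), as in (3.3)."""
    chi = np.zeros(L.shape[0])
    chi[i], chi[j] = 1.0, -1.0
    return float(chi @ np.linalg.pinv(L) @ chi)

# Unweighted triangle: edge (0, 1) is in parallel with the path 0-2-1.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
L = laplacian(3, triangle)
r01 = effective_resistance(L, 0, 1)
```

On the unweighted triangle, a unit resistor in parallel with two unit resistors in series gives $R(0,1) = 2/3$, which is the value the sketch should return up to rounding.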

3.4 Low-stretch subgraphs, spine-heavy graphs and SDD solvers 

Let $S$ be a graph on the same vertex set as a graph $G$. Let $e = (i,j)$ be an edge of $G$. If $p$ is a path $e_1, e_2, \ldots, e_\nu$ between $i$ and $j$ in $S$ we say that the stretch of $e$ over $p$ is $\mathrm{stretch}_p(e) := w_e \sum_{l=1}^{\nu} w_{e_l}^{-1}$, i.e. the weight of $e$ multiplied by the sum of inverse weights of the edges on the path from $i$ to $j$. If $\mathcal{P}(e)$ is the set of all paths between $i$ and $j$ in $S$ we define

$$\mathrm{stretch}_S(e) = \min_{p \in \mathcal{P}(e)} \mathrm{stretch}_p(e).$$

We will use the term stretch of $e$ over $S$ for $\mathrm{stretch}_S(e)$. The definition is simpler when $S$ is a tree $T$. In this case there is a unique path between the endpoints of $e$. We denote by $\mathrm{stretch}_S(G)$ the sum of stretches in $S$ of all edges of $G$, i.e.

$$\mathrm{stretch}_S(G) = \sum_{e \in G} \mathrm{stretch}_S(e).$$
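For the tree case, the definition above can be evaluated directly by walking the unique tree path. The following is a minimal sketch under the assumption that the tree is given as a list of `(u, v, weight)` triples with 0-based vertex labels; the function names `tree_path_resistance` and `total_stretch` are hypothetical and chosen only for this illustration.

```python
from collections import deque

def tree_path_resistance(tree_edges, i, j):
    """Sum of inverse weights on the unique tree path from i to j (BFS from i)."""
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {i: 0.0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v, w in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1.0 / w
                queue.append(v)
    return dist[j]

def total_stretch(graph_edges, tree_edges):
    """stretch_T(G): sum over edges e of w_e times the path resistance in T."""
    return sum(w * tree_path_resistance(tree_edges, i, j)
               for i, j, w in graph_edges)

graph = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0)]
tree = [(0, 1, 1.0), (1, 2, 2.0)]
s = total_stretch(graph, tree)
```

In this example the two tree edges each have stretch 1, and the off-tree edge $(0,2)$ has stretch $3 \cdot (1 + 1/2) = 4.5$, for a total of 6.5.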

It is known that every graph $G$ has a spanning tree $T$ with $\mathrm{stretch}_T(G) = O(m \log n \log\log n)$, known as a low-stretch tree. The tree can be computed in $O(m \log n \log\log n)$ time [1]. Because these guarantees are still open to improvement we will state our results with respect to two parameters: we will denote by $\mathcal{T}_m$ the time required for computing a low-stretch tree on a graph with $m$ edges, and by $s_f$ the factor in excess of $O(m \log n)$ in $\mathrm{stretch}_T(G)$ provided by the $O(\mathcal{T}_m)$ time algorithm. That is, as noted above, the best current guarantees are $s_f = O(\log\log n)$ and $\mathcal{T}_m = O(m \log n \log\log n)$.

We call a graph spine-heavy if it has a spanning tree with $\mathrm{stretch}_T(G) = O(m/\log n)$. Given a graph $G$ we can compute a spine-heavy graph $H$ that $O(s_f \log^2 n)$-approximates it by computing a low-stretch tree and then scaling up the weights of tree edges in $G$ by the $O(s_f \log^2 n)$ factor. This is summarized in the following lemma.

Lemma 3.3 For every graph $G$ with $n$ vertices there is a spine-heavy graph $H$ that $O(s_f \log^2 n)$-approximates $G$. The graph $H$ can be constructed in time dominated by the computation of a low-stretch tree for $G$.

Finally we state a lemma that summarizes the recent work on fast SDD solvers [11].

Lemma 3.4 Let $A$ be an SDD matrix. There is a symmetric operator $\tilde{A}_\delta$ such that

$$(1 - \delta) A \preceq \tilde{A}_\delta \preceq (1 + \delta) A$$

and that for any vector $b$, the vector $\tilde{A}_\delta^+ b$ can be evaluated in $O(\mathcal{T}_m + s_f m \log n \log(1/\delta))$ time. Moreover, if $A$ is the Laplacian of a spine-heavy graph and its low-stretch tree is given, then $\tilde{A}_\delta^+ b$ can be evaluated in $O(m \log(1/\delta))$ time.

3.5 Sampling for sparsification 

In a remarkable work, Spielman and Srivastava [12] analyzed a spectral sparsification algorithm based on a simple sampling procedure. The procedure will be central in our algorithms and we review it here. It takes as input a weighted graph $G$ and frequencies $p'_e$ for each edge $e$. These frequencies are normalized to probabilities $p_e$ summing to 1. It then picks in $q$ rounds exactly $q$ samples which are weighted copies of the edges. The probability that a given edge $e$ is picked in a given round is $p_e$. The weight of the corresponding sample is set so that the expected weight of the edge $e$ after sampling is equal to its actual weight in the input graph. The details are given in the following pseudocode.



Sample

Input: Graph $G = (V, E, w)$, $p' : E \to \mathbb{R}^+$.
Output: Graph $G' = (V, \mathcal{L}, w')$.

$t := \sum_e p'_e$
$q := C_s\, t \log t / \epsilon^2$    (* $C_s$ is an explicitly known constant *)
$p_e := p'_e / t$
$G' := (V, \mathcal{L}, w')$ with $\mathcal{L} = \emptyset$
for $q$ times do
    Sample one $e \in E$ with probability of picking $e$ being $p_e$
    Add sample of $e$, $l$, to $\mathcal{L}$ with weight $w'_l = w_e / p_e$
end for
For all $l \in \mathcal{L}$, let $w'_l := w'_l / q$
return $G'$



Spielman and Srivastava analyzed the case when $p'_e = w_e R_G(e)$, where $R_G(e)$ is the effective resistance of $e$ in $G$. The following generalization characterizes the quality of $G'$ as a spectral sparsifier for $G$. It is shown in [8] and it was originally proved with a weaker success guarantee in [10].

Theorem 3.5 (Oversampling) Let $G = (V, E, w)$ be a graph. Assuming that $p'_e \ge w_e R_G(e)$ for each edge $e \in E$, the graph $G' = \mathrm{Sample}(G, p')$ is a $(1 \pm \epsilon)$ sparsifier of $G$ with probability at least $1 - 1/n^2$.
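The Sample procedure of Section 3.5 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function name `sample`, the default `c_s=1.0` (the constant $C_s$ is not specified in this section), the rounding of $q$ up to an integer, and the merging of repeated samples of the same edge into one weighted edge are all assumptions made for the example.

```python
import math
import numpy as np

def sample(edges, freqs, eps, c_s=1.0, rng=None):
    """Sample q weighted copies of edges; edges are (i, j, w), freqs are p'_e.

    Returns a dict mapping (i, j) to the total weight of its sampled copies.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = float(sum(freqs))
    # q = C_s * t * log(t) / eps^2, rounded up (c_s = 1.0 is a placeholder).
    q = max(1, math.ceil(c_s * t * math.log(t) / eps ** 2))
    p = np.asarray(freqs, dtype=float) / t      # probabilities summing to 1
    picks = rng.choice(len(edges), size=q, p=p)  # q rounds, with replacement
    out = {}
    for idx in picks:
        i, j, w = edges[idx]
        # Each copy gets weight w_e / (p_e * q), so E[total weight of e] = w_e.
        out[(i, j)] = out.get((i, j), 0.0) + w / (p[idx] * q)
    return out

triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
# Frequencies w_e * R_G(e) = 2/3 for every edge of the unweighted triangle.
sparsifier = sample(triangle, [2 / 3, 2 / 3, 2 / 3], eps=0.5,
                    rng=np.random.default_rng(0))
```

Merging parallel copies does not change the Laplacian of the output, since the Laplacian depends only on the total weight between each pair of vertices.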

4 The general case: An $\tilde{O}(m \log^2 n)$ time algorithm

As we discussed above, Spielman and Srivastava [12] use the Sample algorithm with $p'_e = w_e R_G(e)$. For the efficient implementation of their algorithm they first obtain a different expression for the effective resistance, via a simple algebraic manipulation:

$$\begin{aligned}
R_G(i,j) &= (\chi_i - \chi_j)^T L^+ (\chi_i - \chi_j) \\
&= (\chi_i - \chi_j)^T L^+ L L^+ (\chi_i - \chi_j) \\
&= (\chi_i - \chi_j)^T L^+ B^T W^{1/2} W^{1/2} B L^+ (\chi_i - \chi_j).
\end{aligned}$$

The advantage of this definition is that it expresses the effective resistance as the squared Euclidean distance of two points, given by the $i$th and $j$th column of the matrix $W^{1/2} B L^+$. This new expression still involves the solution of a linear system with $L$. The natural idea is to replace $L$ with an approximation $\tilde{L}_\delta$ satisfying the properties described in Lemma 3.4. So instead of $R_G(i,j)$ we compute the quantities

$$\tilde{R}_G(i,j) = \| W^{1/2} B \tilde{L}_\delta^+ (\chi_i - \chi_j) \|^2.$$

Of course, there are still $m$ systems to be solved. To work around this hurdle, Spielman and Srivastava observe that projecting the vectors to an $O(\log n)$-dimensional space preserves the Euclidean distances within a factor of $1 \pm \epsilon/8$, by the Johnson-Lindenstrauss theorem. Algebraically this amounts to computing the quantities $\| Q W^{1/2} B \tilde{L}_\delta^+ (\chi_i - \chi_j) \|^2$, where $Q$ is a properly defined random matrix of dimension $k \times m$ for $k = O(\log n)$. The authors invoke the result of Achlioptas [2], which states that one can use a matrix $Q$ each of whose entries is randomly chosen in $\{\pm 1/\sqrt{k}\}$.

The construction of the sparsifiers can thus be broken up into three steps.

1. Compute $Q W^{1/2} B$. This takes time $O(km)$, since $B$ has only two non-zero entries per row.

2. Apply the linear operator $\tilde{L}_\delta^+$ to the $k$ columns of the matrix $(Q W^{1/2} B)^T$, using Lemma 3.4. This gives the matrix $Z = Q W^{1/2} B \tilde{L}_\delta^+$.

3. Compute all the (approximate) effective resistances (time $O(km)$) via the squared norm of the differences between columns of the matrix $Z$. Then sample the edges.
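The three steps above can be sketched numerically as follows. The sketch is only meant to show the linear algebra: it replaces the fast solver of Lemma 3.4 with a dense `numpy.linalg.pinv`, so it does not achieve the stated running times. The function name `approx_resistances`, the 0-based vertex labels, and the dense matrix representation are assumptions made for the example.

```python
import numpy as np

def approx_resistances(n, edges, k, rng):
    """Estimate R(e) for every edge (i, j, w) via a k x m random projection."""
    m = len(edges)
    B = np.zeros((m, n))
    w = np.empty(m)
    for row, (i, j, weight) in enumerate(edges):
        B[row, i], B[row, j] = 1.0, -1.0
        w[row] = weight
    L = B.T @ (w[:, None] * B)
    # Random matrix with entries chosen uniformly from {+1/sqrt(k), -1/sqrt(k)}.
    Q = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)
    Y = Q @ (np.sqrt(w)[:, None] * B)   # step 1: Q W^{1/2} B, a k x n matrix
    Z = Y @ np.linalg.pinv(L)           # step 2: dense stand-in for the solver
    # Step 3: squared norm of the difference of columns i and j of Z.
    return np.array([np.sum((Z[:, i] - Z[:, j]) ** 2) for i, j, _ in edges])

triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
estimates = approx_resistances(3, triangle, k=2000,
                               rng=np.random.default_rng(0))
```

For the unweighted triangle every true effective resistance is $2/3$, so the estimates should concentrate around that value as $k$ grows; in the algorithm itself $k = O(\log n)$ suffices for the $1 \pm \epsilon/8$ guarantee.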

4.1 The $\tilde{O}(m \log^2 n)$ time algorithm

Spielman and Srivastava prove that the approximations $\tilde{R}_G(i,j)$ can be used to obtain the sparsifier if they satisfy

$$(1 - \epsilon/4) R_G(i,j) \le \tilde{R}_G(i,j) \le (1 + \epsilon/4) R_G(i,j).$$

Then they show that this can be satisfied if $\delta$, the accuracy guarantee of the linear system solver, is taken to be an inverse polynomial in $n$. Thus their algorithm is dominated by the second step (the applications of $\tilde{L}_\delta^+$) and takes time $O(\mathcal{T}_m + s_f m \log^3 n \log(1/\epsilon))$.

The following lemma shows that in fact it is enough to take $\delta$ to be a constant. Furthermore, our proof significantly simplifies the corresponding analysis of [12].

Lemma 4.1 For a given $\epsilon$, if $\tilde{L}$ satisfies $(1 - \delta) L \preceq \tilde{L} \preceq (1 + \delta) L$ where $\delta = \epsilon/8$, then the approximate effective resistance values $\tilde{R}_G(u,v) = \| W^{1/2} B \tilde{L}^+ (\chi_u - \chi_v) \|^2$ satisfy:

$$(1 - \epsilon) R_G(u,v) \le \tilde{R}_G(u,v) \le (1 + \epsilon) R_G(u,v).$$






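Proof. We only show the first half of the inequality, as the other half follows similarly. Since $L$ and $\tilde{L}$ have the same null space, by (3.1) the given condition is equivalent to:

$$\frac{1}{1+\delta} L^+ \preceq \tilde{L}^+ \preceq \frac{1}{1-\delta} L^+.$$

Since $\frac{1}{1+\delta} L^+ \preceq \tilde{L}^+$, we have

$$\begin{aligned}
R_G(u,v) &= (\chi_u - \chi_v)^T L^+ (\chi_u - \chi_v) \\
&\le (1 + \delta)(\chi_u - \chi_v)^T \tilde{L}^+ (\chi_u - \chi_v) \\
&= (1 + \delta)(\chi_u - \chi_v)^T \tilde{L}^+ \tilde{L} \tilde{L}^+ (\chi_u - \chi_v).
\end{aligned}$$

Applying the fact that $\tilde{L} \preceq (1 + \delta) L$ to the vector $\tilde{L}^+ (\chi_u - \chi_v)$ in turn gives:

$$\begin{aligned}
R_G(u,v) &\le (1 + \delta)^2 (\chi_u - \chi_v)^T \tilde{L}^+ L \tilde{L}^+ (\chi_u - \chi_v) \\
&= (1 + \delta)^2 \| W^{1/2} B \tilde{L}^+ (\chi_u - \chi_v) \|^2 = (1 + \delta)^2 \tilde{R}_G(u,v).
\end{aligned}$$

The rest of the proof follows from $(1 + \delta)^{-2} \ge 1 - \epsilon/4$ by choice of $\delta$. □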

This proves our first theorem.

Theorem 4.2 There is a $1 \pm \epsilon$ sparsification algorithm that runs in $O(\mathcal{T}_m + s_f m \log^2 n \log(1/\epsilon))$ time.

5 The $\tilde{O}(m \log n)$ time algorithm

Informally, the oversampling Theorem 3.5 states that if we use estimates of the effective resistances, rather than the true values, the Spielman-Srivastava scheme still works; but in order to produce the sparsifier we have to compensate by taking more samples. We exploit this in our second theorem.

Theorem 5.1 There is a $(1 \pm \epsilon)$-sparsification algorithm that runs in $O(\mathcal{T}_m + m \log n \log(1/\epsilon))$ time and returns a sparsifier with $O(s_f n \log^3 n/\epsilon^2)$ edges. As a result, we can compute a $(1 \pm \epsilon)$-sparsifier with $O(n \log n/\epsilon^2)$ edges in $O(\mathcal{T}_m + m \log n \log(1/\epsilon) + s_f^2 n \log^5 n)$ time.

Proof. Given the input graph $G$ we construct a spine-heavy graph $H$ that $O(s_f \log^2 n)$-approximates $G$. The construction can be done in $O(\mathcal{T}_m)$ time, by Lemma 3.3. We then run the Spielman-Srivastava scheme (Section 4) on $H$ to approximate the effective resistances $R_H(i,j)$ within a factor of $1 \pm \epsilon$. Step 2 of the Spielman-Srivastava scheme runs in $O(m \log n \log(1/\epsilon))$ time on $H$, by Lemma 3.4. We adjust the approximate effective resistances in $H$ down by a factor of $1 + \epsilon$ to account for the upper side of the error in Lemma 4.1. Then, by (3.4), the calculated approximate effective resistances $\tilde{R}_H(i,j)$ satisfy

$$\frac{R_G(i,j)}{O(s_f \log^2 n)} \le \tilde{R}_H(i,j) \le R_G(i,j).$$

So Theorem 3.5 applies if we take $p'_e = O(s_f \log^2 n)\, w_e \tilde{R}_H(i,j)$. We have

$$\sum_e p'_e = O(s_f \log^2 n) \sum_e w_e \tilde{R}_H(i,j) \le O(s_f \log^2 n) \sum_e w_e R_G(i,j) = O(s_f n \log^2 n).$$

The last equality follows from the fact that $\sum_e w_e R_G(i,j) = n - 1$ for any graph $G$ (e.g. see [12]). Hence the total number of samples we need to take in order to produce a $(1 \pm \epsilon)$-sparsifier is $O(s_f n \log^3 n/\epsilon^2)$. The second sparsifier is computed by re-sparsifying with the general case algorithm (and appropriate settings for $\epsilon$). □

6 The $\tilde{O}(m)$ and $O(m)$ time algorithms

6.1 The base approximate sparsifier

We will show that for every graph $G$ we can in $O(m)$ time compute an $O(s_f \log^4 n)$-approximation with $O(n \log n)$ edges, i.e. an approximate sparsifier $\tilde{H}$ of $G$. To find the approximate sparsifier we will first identify an $O(s_f \log^4 n)$-approximation $H$ of $G$, which in turn can be sparsified in $O(m)$ time. Consider for the moment the following Lemma.

Lemma 6.1 Let $G$ be a graph with $n$ vertices and $m > 2n$ edges. Let $T$ be a low-stretch tree of $G$ and let $H = G + O(s_f^2 \log^4 n) T$. Given $T$, a 4-approximation of $H$ with $O(n \log n)$ edges can be computed in $O(m + s_f n \log^2 n)$ time.

Proof. In [10] it was shown that applying Sample on $H$, with $p'_e$ taken to be the stretch of $e$ over the tree $T$, produces a graph $I$ with $n + m/(s_f \log^2 n)$ edges such that $I$ is a 2-approximation of $H$. With the general case algorithm we can, in time $O(m + s_f n \log^2 n)$, compute a 2-approximation $\tilde{I}$ of $I$ with $O(n \log n)$ edges. By transitivity $\tilde{I}$ is a 4-approximation of $H$. □

The graph $H$ in the above Lemma comes close to our requirements. The main problem lies in the speed of the low-stretch tree computation: the fastest known algorithm for computing a low-stretch tree requires $O(s_f m \log n)$ time, and we would like to compute $H$ faster. To work around this hurdle we will replace the low-stretch tree with a low-stretch subgraph $S$ of $G$ which can be computed in $O(m)$ time. For this, we will use spanners. A $t$-spanner of a graph $G$ is a subgraph $S$ of $G$ that preserves distances within a factor of $t$. It follows that if $G$ is unweighted and $S$ is a $\log n$-spanner, then for every edge $e$ of $G$ there is a path of length $O(\log n)$ in $S$. That is, for each $e$, we have

$$\mathrm{stretch}_S(e) = O(\log n).$$

We will use the following Lemma from [7].

Lemma 6.2 Given an unweighted graph $G$, an $O(\log n)$-spanner $S$ with $O(n)$ edges can be computed in $O(m)$ time.

We then get the following.

Lemma 6.3 Let $G$ be an unweighted graph with $n$ vertices. We can compute an $O(s_f \log^4 n)$-approximation $H$ of $G$ and a 4-approximation of $H$ with $O(n \log n)$ edges in $O(m + s_f n \log^3 n)$ time. The graph $H$ is of the form $G + O(s_f \log^4 n) S$ where $S$ is a spanner of $G$.

Proof. Let $S$ be the spanner of $G$ provided by Lemma 6.2. The proof mirrors that of Lemma 6.1, with some minor changes. First notice that the graph $S$ gets scaled up by a factor $s_f$ smaller than the tree in Lemma 6.1. This reflects the fact that the average stretch over $S$ is $s_f$ times smaller than over $T$. To compute the required sparsifier, we first find a graph $I$ with $O(n \log n + m/(s_f \log^2 n))$ edges which is a 2-approximation of $H$. The only difference is that we now compute $I$ by applying Sample with uniform frequencies $p'_e$, because the stretch of all edges is bounded above by $O(\log n)$. Then using our general case algorithm we can, in time $O(m + s_f n \log^3 n)$, compute a 2-approximation $\tilde{I}$ of $I$ with $O(n \log n)$ edges. By transitivity $\tilde{I}$ is a 4-approximation of $H$. □
6.2 Leveraging an approximate sparsifier

The purpose of this section is to show that, provided an approximate sparsifier for a graph $G$, we can efficiently produce a sparsifier for $G$. So far we have been using only low-stretch subgraphs of $G$ to get approximations to the effective resistances in $G$. A key to our fastest algorithm is the realization that we can find low-stretch trees that are not necessarily subtrees of the given graph $G$; the total stretch will actually be only a near-linear function of $n$. This is based on the following Lemma.

Lemma 6.4 Let $H' \preceq H$. Then for any tree $T$

$$\mathrm{stretch}_T(H') \le \mathrm{stretch}_T(H).$$

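Proof. Let $A^+$ denote the Moore-Penrose pseudoinverse of a matrix $A$ and $\lambda_i(A)$ denote the $i$th largest eigenvalue of $A$. By Spielman and Woo [15] we know that for any graph $G$

$$\mathrm{stretch}_T(G) = \mathrm{trace}(L_G L_T^+) = \sum_i \lambda_i(L_G L_T^+).$$

Because $L_G$ and $L_T$ have the same null space (the constant vector), we can write

$$\lambda_i(L_G L_T^+) = \lambda_i(L_T^{+/2} L_G L_T^{+/2}).$$

Notice now that we have

$$H' \preceq H \;\Rightarrow\; L_T^{+/2} L_{H'} L_T^{+/2} \preceq L_T^{+/2} L_H L_T^{+/2}.$$

This follows easily by definition. It is also easy to prove (see for example [9], Ch. 6.1) that

$$A \preceq B \;\Rightarrow\; \lambda_i(A) \le \lambda_i(B).$$

Hence

$$\begin{aligned}
\lambda_i(L_T^{+/2} L_{H'} L_T^{+/2}) \le \lambda_i(L_T^{+/2} L_H L_T^{+/2}) &\;\Rightarrow \\
\mathrm{trace}(L_T^{+/2} L_{H'} L_T^{+/2}) \le \mathrm{trace}(L_T^{+/2} L_H L_T^{+/2}) &\;\Rightarrow \\
\mathrm{stretch}_T(H') \le \mathrm{stretch}_T(H). &
\end{aligned}$$

□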
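The trace identity used in the proof, and the monotonicity it implies, can be checked numerically on a small example. This is a hedged sketch only: the helper names `laplacian` and `total_stretch`, the 0-based vertex labels, and the dense `numpy.linalg.pinv` are assumptions made for the illustration, and the particular graphs are arbitrary.

```python
import numpy as np

def laplacian(n, edges):
    """Weighted Laplacian of a graph given as (i, j, weight) triples."""
    L = np.zeros((n, n))
    for i, j, w in edges:
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L

def total_stretch(n, graph_edges, tree_edges):
    """stretch_T(G) computed through the identity trace(L_G L_T^+)."""
    L_T_pinv = np.linalg.pinv(laplacian(n, tree_edges))
    return float(np.trace(laplacian(n, graph_edges) @ L_T_pinv))

tree = [(0, 1, 1.0), (1, 2, 2.0)]
H = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0)]
# H' = H/2 has every weight halved, so its Laplacian is dominated by that of H.
H_prime = [(i, j, 0.5 * w) for i, j, w in H]

stretch_H = total_stretch(3, H, tree)
stretch_H_prime = total_stretch(3, H_prime, tree)
```

Here the trace formula reproduces the edge-by-edge value $1 + 1 + 3 \cdot 1.5 = 6.5$ for $H$, and the dominated graph $H'$ has the smaller total stretch $3.25$, as Lemma 6.4 requires.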

We are now ready to prove the main Lemma in this subsection. To avoid confusion we will use $m_H$ to denote the number of edges of a graph $H$.

Lemma 6.5 (Leveraging) Let $H$ be a $\kappa$-approximation of $H'$. Suppose we are given $\tilde{H}'$, a 4-approximation of $H'$ with $O(n \log n)$ edges. Then we can construct a $(1 \pm \epsilon)$-approximation of $H$ with $O(s_f \kappa n \log^3 n)$ edges in $O(m_H + \mathcal{T}_{n \log n})$ time. We can also construct a $(1 \pm \epsilon)$-sparsifier of $H$ with $O(n \log n)$ edges in $O(m_H + \mathcal{T}_{n \log n} + s_f^2 \kappa n \log^5 n)$ time.

Proof. We compute a low-stretch spanning tree T of H' in 0(T n \ O g n ) time. Because H' has O(nlogn) 
edges we have 

stretchy(ir) = 0(sfn log 2 n). 

We then compute in O(mjj) time the effective resistance R T (e) of each edge e of H over T. This can 
be done in 0(niH) time using off-line LCA algorithms by Gabow and Tarjan[16, 6]. Since H' ^ 4H' and 
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H' ■< H we can apply (3.2) twice and get 



R H (e) < R H '{e) < m H '{e). 
By Rayleigh's monotonicity theorem (e.g. see [10]) we get that: 

R 6 '(e)<R T (e). 

Hence setting $p'_e = 4 w_e R_T(e)$ in Theorem 3.5 allows us to sparsify $H$. To get a $(1 \pm \epsilon)$-approximation of
$H$, the number of samples we need to take is $O(q \log q/\epsilon^2)$ where $q = \sum_e 4 w_e R_T(e) = 4\,\mathrm{stretch}_T(H)$. Since
$H' \preceq \tilde{H}'$ and $H \preceq \kappa H'$, we apply Lemma 6.4 twice and we get that

$$\mathrm{stretch}_T(H) \le \kappa \cdot \mathrm{stretch}_T(H') \le \kappa \cdot \mathrm{stretch}_T(\tilde{H}') = O(s_f \kappa n \log^2 n).$$

This proves the first claim. The second claim follows by picking an appropriate $\epsilon$ and re-sparsifying the
first sparsifier with the general case sparsification algorithm. □
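The sampling step of this proof can be illustrated with a short sketch. Several parts are assumptions of ours: Theorem 3.5 is not restated in this section, so we assume the standard scheme of drawing edges with replacement with probability proportional to $w_e R_T(e)$ and adding $w_e/(N p_e)$ per draw (the constant 4 cancels after normalization); tree resistances are found with a naive parent-walking LCA rather than the off-line LCA algorithms cited above; and the function names and toy graph are hypothetical.

```python
import random
from collections import defaultdict

def tree_resistances(tree_edges, query_edges):
    """R_T(u, v) for each query edge: sum of 1/w along the tree path.

    Naive parent-walk LCA for illustration only (not the off-line LCA
    algorithm referenced in the text).
    """
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    root = tree_edges[0][0]
    parent, level, dist = {root: None}, {root: 0}, {root: 0.0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in parent:
                parent[v], level[v] = u, level[u] + 1
                dist[v] = dist[u] + 1.0 / w  # resistance from the root
                stack.append(v)

    def lca(u, v):
        while u != v:
            if level[u] < level[v]:
                u, v = v, u
            u = parent[u]  # lift the deeper (or equally deep) vertex
        return u

    return [dist[u] + dist[v] - 2 * dist[lca(u, v)] for u, v, _ in query_edges]

def sample_by_tree_stretch(edges, r_tree, num_samples, rng):
    """Sample edges with probability proportional to w_e * R_T(e)."""
    scores = [w * r for (_, _, w), r in zip(edges, r_tree)]
    q = sum(scores)  # equals stretch_T of the sampled graph
    picks = rng.choices(range(len(edges)), weights=scores, k=num_samples)
    out = defaultdict(float)
    for i in picks:
        u, v, w = edges[i]
        # w_e / (N * p_e) with p_e = scores[i] / q keeps each weight unbiased
        out[(u, v)] += w * q / (num_samples * scores[i])
    return dict(out), q

# Illustrative instance: unit-weight path tree plus two off-tree edges.
tree = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
H = tree + [(0, 3, 2.0), (0, 2, 1.0)]
r = tree_resistances(tree, H)
sparsifier, q = sample_by_tree_stretch(H, r, num_samples=50, rng=random.Random(0))
```

In the Lemma the number of draws would be of order $q \log q/\epsilon^2$; the value 50 above is arbitrary and chosen only to keep the example small.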

6.3 Sparsifiers for unweighted graphs 

We are now ready to prove our main claims. 

Theorem 6.6 Given an unweighted graph $G$ we can compute a $(1 \pm \epsilon)$-spectral sparsifier of $G$ with
$O(n \log n/\epsilon^2)$ edges in $O((m + T_{n \log n} + s_f^2 n \log^5 n) \log\log n)$ time.

Proof. Let $H = G + O(s_f \log^3 n) S$ be the graph provided by Lemma 6.3. We construct a sequence of
graphs,

$$H = H_0, H_1, \ldots, H_t = G$$

where $H_i = G + O(s_f \log^3 n) S/2^i$, for some appropriate $t = O(\log\log n)$. Notice that all graphs $H_i$ have
$m$ edges, so this takes $O(m \log\log n)$ time. For $i = 0, \ldots, t-1$, let $\tilde{H}_i$ denote a 4-approximation of $H_i$
with $O(n \log n)$ edges. By Lemma 6.3 we can compute $\tilde{H}_0$ in $O(m)$ time. Provided now that we have $\tilde{H}_j$
we can apply Lemma 6.5 (with $\kappa = 2$, $\epsilon = 1/2$ and a proper scaling of the $(1 \pm \epsilon)$-sparsifier) to get a
4-approximation $\tilde{H}_{j+1}$ in $O(m + s_f^2 n \log^5 n)$ time. From $\tilde{H}_{t-1}$ we produce a $(1 \pm \epsilon)$-sparsifier of $H_t = G$ again
by Lemma 6.5, using the desired value for $\epsilon$. Because we apply the algorithm of Lemma 6.5 $O(\log\log n)$
times, we get the claimed running time. □
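The parameter choices in this proof can be unpacked with a short calculation. It is a sketch under our own notation: write $H_i = G + cS/2^i$ with $c$ the coefficient of $S$ in $H_0$, and let $\bar{H}$ (a name introduced here for illustration) be a $(1 \pm 1/2)$-sparsifier of a graph $H$. Since the Laplacians of $G$ and $S$ are positive semidefinite,

$$\tfrac{1}{2} H_i = \tfrac{1}{2} G + \frac{c}{2^{i+1}} S \;\preceq\; G + \frac{c}{2^{i+1}} S = H_{i+1} \;\preceq\; G + \frac{c}{2^{i}} S = H_i,$$

so consecutive graphs in the sequence agree up to a factor of 2, which is where $\kappa = 2$ comes from (up to rescaling $H_i$ by a constant). For the scaling step, with $\epsilon = 1/2$,

$$\tfrac{1}{2} H \preceq \bar{H} \preceq \tfrac{3}{2} H \;\Rightarrow\; H \preceq 2\bar{H} \preceq 3H \preceq 4H,$$

so $2\bar{H}$ satisfies the sandwich $H \preceq 2\bar{H} \preceq 4H$. Finally, after $t = \lceil \log_2 c \rceil$ halvings the coefficient of $S$ is at most 1, and $t = O(\log\log n)$ provided $c$ is polylogarithmic in $n$, which we assume here for $s_f$.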

Theorem 6.7 Given an unweighted graph $G$ we can compute a $(1 \pm \epsilon)$-spectral sparsifier of $G$ with
$O(n \log n/\epsilon^2)$ edges in $O(m + T_{n \log n} + s_f^3 n \log^8 n)$ time.

Proof. Let $H = G + O(s_f \log^3 n) S$ be constructed as in Lemma 6.3. Notice the smaller scaling factor
relative to the Lemma. Similarly to Lemma 6.3 we compute a graph $I$ which 2-approximates $H$ and has
$O(n \log n + m/(s_f \log n))$ edges; this is slightly more dense than the graph $I$ in the Lemma, due to the
smaller scale-up factor. We then use our second algorithm to sparsify $I$ and get a 4-approximation $H'$ of
$H$ with $O(n \log n)$ edges in $O(m + s_f^2 n \log^6 n)$ time. Because $H$ is an $O(s_f \log^3 n)$-approximation of $G$ we
can 'leverage' its sparsifier $H'$ to compute a $(1 \pm \epsilon)$-sparsifier of $G$ via Lemma 6.5 (with $\kappa = O(s_f \log^3 n)$)
in $O(m + T_{n \log n} + s_f^3 n \log^8 n)$ time. □

6.4 The weighted case: sparsification via decompositions 

We first prove the following Lemma. 

Lemma 6.8 For any graph $G$ with polynomially bounded weights we can compute in $O(m \log\log n)$ time
a low-stretch subgraph $S$ with $O(n \log n)$ edges such that for all edges $e$ of $G$, we have

$$\mathrm{stretch}_S(e) = O(\log n).$$

Proof. Recall our assumption that the graph weights are polynomially bounded. In $O(m \log\log n)$
time we can decompose $G$ into $O(\log n)$ edge-disjoint graphs $G_j$. Suppose $G_j$ consists of $m_j$ edges with
weights in $[2^j, 2^{j+1})$, for $j = 0, \ldots, O(\log n)$. We then round the weights in $G_j$ to the nearest power of 2
and find a spanner $S_j$ using Lemma 6.2. Finally we let $S = \sum_j S_j$. This computation can be performed
in $\sum_j O(m_j) = O(m)$ time. Now let $e$ be an edge in $G_j$. We have

$$\mathrm{stretch}_S(e) \le \mathrm{stretch}_{S_j}(e) = O(\log n).$$

□ 
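The decomposition step of this proof, grouping edges into weight classes $[2^j, 2^{j+1})$, can be sketched as follows. This is only an illustration under our own assumptions: weights are positive floats, the function name `weight_classes` and the sample edges are hypothetical, and the spanner construction of Lemma 6.2 that would then run on each class is not reproduced. The sketch makes one dictionary-based pass and is not meant to model the running time claimed in the Lemma.

```python
import math
from collections import defaultdict

def weight_classes(edges):
    """Group (u, v, w) edges so that class j holds weights in [2**j, 2**(j+1))."""
    classes = defaultdict(list)
    for u, v, w in edges:
        # frexp gives w = mantissa * 2**exponent with mantissa in [0.5, 1),
        # hence 2**(exponent - 1) <= w < 2**exponent.
        _, exponent = math.frexp(w)
        classes[exponent - 1].append((u, v, w))
    return dict(classes)

# Illustrative weighted edges.
edges = [(0, 1, 1.0), (1, 2, 1.5), (2, 3, 2.0), (0, 3, 7.9), (0, 2, 8.0)]
classes = weight_classes(edges)
```

Using `math.frexp` instead of a floating-point logarithm avoids rounding trouble at exact powers of 2, which sit on the class boundaries.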

This Lemma is enough to generalize Theorem 6.6 to weighted graphs with polynomially bounded 
weights. All we need to do is to replace the spanner S in the proof with the low-stretch subgraph provided 
by Lemma 6.8. 

7 Final Remarks 

We remark that the fastest sparsification algorithms of this paper rely crucially on graph decompositions. 
However it seems natural to conjecture that decompositions are not necessary, and that the same upper 
bound can be obtained via a straightforward sampling scheme. We believe that this is an interesting question
that would potentially lead to a deeper understanding of low-stretch subgraph computations. 

On the other hand the original algorithm of Spielman and Teng remains the only known combinatorial 
sparsification algorithm that does not rely on solving systems. Designing a spectral sparsification algorithm 
that does not depend on a linear system solver and that outputs a very sparse graph with $O(n \log n)$ or
$O(n \log^2 n)$ edges is a challenging open problem. Since achieving this may prove to be difficult, an alternate
approach could be algorithms that compute very sparse $\kappa$-approximations for small values of $\kappa$. Such
algorithms could play a significant role in the development of more practical SDD solvers. 